Damietta Governorate
JEEM: Vision-Language Understanding in Four Arabic Dialects
Kadaoui, Karima, Atwany, Hanin, Al-Ali, Hamdan, Mohamed, Abdelrahman, Mekky, Ali, Tilga, Sergei, Fedorova, Natalia, Artemova, Ekaterina, Aldarmaki, Hanan, Kementchedjhieva, Yova
We introduce JEEM, a benchmark designed to evaluate Vision-Language Models (VLMs) on visual understanding across four Arabic-speaking countries: Jordan, The Emirates, Egypt, and Morocco. JEEM includes the tasks of image captioning and visual question answering, and features culturally rich and regionally diverse content. The dataset aims to assess the ability of VLMs to generalize across dialects and accurately interpret cultural elements in visual contexts. In an evaluation of five prominent open-source Arabic VLMs and GPT-4V, we find that the Arabic VLMs consistently underperform, struggling with both visual understanding and dialect-specific generation. While GPT-4V ranks best in this comparison, the model's linguistic competence varies across dialects, and its visual understanding capabilities lag behind. This underscores the need for more inclusive models and the value of culturally diverse evaluation paradigms.
- Asia > Middle East > Jordan (0.25)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Singapore (0.04)
- Leisure & Entertainment (1.00)
- Health & Medicine (1.00)
- Transportation > Ground (0.46)
BD-SAT: High-resolution Land Use Land Cover Dataset & Benchmark Results for Developing Division: Dhaka, BD
Paul, Ovi, Nayem, Abu Bakar Siddik, Sarker, Anis, Ali, Amin Ahsan, Amin, M Ashraful, Rahman, AKM Mahbubur
Land Use Land Cover (LULC) analysis of satellite images using deep learning-based methods helps in understanding the geography, socio-economic conditions, poverty levels, and urban sprawl of developing countries. Recent works involve segmentation with LULC classes such as farmland, built-up areas, forests, meadows, and water bodies. Training deep learning methods on satellite images requires large sets of images annotated with LULC classes. However, annotated data for developing countries are scarce due to a lack of funding, the absence of dedicated residential/industrial/economic zones, large populations, and diverse building materials. BD-SAT provides a high-resolution dataset that includes pixel-by-pixel LULC annotations for Dhaka metropolitan city and the surrounding rural/urban areas. The ground truth is created under a strict, standardized procedure from Bing satellite imagery with a ground sampling distance of 2.22 meters per pixel. A well-defined three-stage annotation process was followed, with support from GIS experts, to ensure the reliability of the annotations. We performed several experiments to establish benchmark results. The results show that the annotated BD-SAT is sufficient to train large deep learning models with adequate accuracy for five major LULC classes: forest, farmland, built-up areas, water bodies, and meadows.
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.62)
- Africa > Ethiopia (0.04)
- Africa > Uganda (0.04)
Efficient liver segmentation with 3D CNN using computed tomography scans
Humady, Khaled, Al-Saeed, Yasmeen, Eladawi, Nabila, Elgarayhi, Ahmed, Elmogy, Mohammed, Sallah, Mohammed
The liver is one of the most critical metabolic organs, performing vital functions in the human body such as clearing waste products and medications from the blood. Liver disease caused by liver tumors is among the most common causes of death worldwide. Hence, detecting liver tumors in the early stages of their development is an essential part of medical treatment. Many imaging modalities can aid in detecting liver tumors. Computed tomography (CT) is the most widely used imaging modality for soft-tissue organs such as the liver, because it is a non-invasive modality that can be captured relatively quickly. This paper proposes an efficient automatic liver segmentation framework that detects and segments the liver in abdominal CT scans using the 3D CNN DeepMedic network model. Many studies first segment the liver region accurately and then use it as input to a tumor segmentation method, since this reduces the false detections that result from segmenting other abdominal organs as tumors. The proposed 3D CNN DeepMedic model has two input pathways rather than one, as in the original 3D CNN model. In this paper, the network was supplied with multiple versions of the abdominal CT scans, which helped improve the segmentation quality. The proposed model achieved 94.36%, 94.57%, 91.86%, and 93.14% for accuracy, sensitivity, specificity, and Dice similarity score, respectively. The experimental results indicate the applicability of the proposed method.
- North America (0.04)
- Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
- Atlantic Ocean > Mediterranean Sea > Adriatic Sea (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Hepatology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
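The liver segmentation abstract above reports accuracy, sensitivity, specificity, and Dice similarity score. As a rough illustration of how these four quantities relate, the sketch below computes them from a predicted and a reference binary mask using the standard confusion-matrix definitions. It is not the authors' evaluation code: the function name `segmentation_metrics` and the flattened 0/1 list representation of the masks are assumptions made for this example.

```python
import math


def segmentation_metrics(pred, truth):
    """Compute standard binary-segmentation metrics.

    `pred` and `truth` are equal-length sequences of 0/1 labels, e.g. a
    flattened predicted liver mask and its reference mask (hypothetical
    inputs chosen for this sketch).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    # Confusion-matrix counts over all voxels.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def ratio(num, den):
        # Return NaN instead of raising when a metric is undefined.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),      # true positive rate
        "specificity": ratio(tn, tn + fp),      # true negative rate
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]
    reference = [1, 0, 0, 0, 1, 1, 0, 0]
    for name, value in segmentation_metrics(predicted, reference).items():
        print(f"{name}: {value:.4f}")
```

Because Dice ignores true negatives, it tends to be the more informative of the four when the organ occupies only a small fraction of the scan volume, whereas accuracy and specificity can be high simply because most voxels are background.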